146 resultados para Vector quantization

em Indian Institute of Science - Bangalore - Índia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

We address the issue of complexity for vector quantization (VQ) of wide-band speech LSF (line spectrum frequency) parameters. The recently proposed switched split VQ (SSVQ) method provides better rate-distortion (R/D) performance than the traditional split VQ (SVQ) method, even at the requirement of lower computational complexity. but at the expense of much higher memory. We develop the two stage SVQ (TsSVQ) method, by which we gain both the memory and computational advantages and still retain good R/D performance. The proposed TsSVQ method uses a full dimensional quantizer in its first stage for exploiting all the higher dimensional coding advantages and then, uses an SVQ method for quantizing the residual vector in the second stage so as to reduce the complexity. We also develop a transform domain residual coding method in this two stage architecture such that it further reduces the computational complexity. To design an effective residual codebook in the second stage, variance normalization of Voronoi regions is carried out which leads to the design of two new methods, referred to as normalized two stage SVQ (NTsSVQ) and normalized two stage transform domain SVQ (NTsTrSVQ). These two new methods have complimentary strengths and hence, they are combined in a switched VQ mode which leads to the further improvement in R/D performance, but retaining the low complexity requirement. We evaluate the performances of new methods for wide-band speech LSF parameter quantization and show their advantages over established SVQ and SSVQ methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the use of a two stage transform vector quantizer (TSTVQ) for coding of line spectral frequency (LSF) parameters in wideband speech coding. The first stage quantizer of TSTVQ, provides better matching of source distribution and the second stage quantizer provides additional coding gain through using an individual cluster specific decorrelating transform and variance normalization. Further coding gain is shown to be achieved by exploiting the slow time-varying nature of speech spectra and thus using inter-frame cluster continuity (ICC) property in the first stage of TSTVQ method. The proposed method saves 3-4 bits and reduces the computational complexity by 58-66%, compared to the traditional split vector quantizer (SVQ), but at the expense of 1.5-2.5 times of memory.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We address the issue of rate-distortion (R/D) performance optimality of the recently proposed switched split vector quantization (SSVQ) method. The distribution of the source is modeled using Gaussian mixture density and thus, the non-parametric SSVQ is analyzed in a parametric model based framework for achieving optimum R/D performance. Using high rate quantization theory, we derive the optimum bit allocation formulae for the intra-cluster split vector quantizer (SVQ) and the inter-cluster switching. For the wide-band speech line spectrum frequency (LSF) parameter quantization, it is shown that the Gaussian mixture model (GMM) based parametric SSVQ method provides 1 bit/vector advantage over the non-parametric SSVQ method.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a new weighting function which is computationally simple and an approximation to the theoretically derived optimum weighting function shown in the literature. The proposed weighting function is perceptually motivated and provides improved vector quantization performance compared to several weighting functions proposed so far, for line spectrum frequency (LSF) parameter quantization of both clean and noisy speech data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A better performing product code vector quantization (VQ) method is proposed for coding the line spectrum frequency (LSF) parameters; the method is referred to as sequential split vector quantization (SeSVQ). The split sub-vectors of the full LSF vector are quantized in sequence and thus uses conditional distribution derived from the previous quantized sub-vectors. Unlike the traditional split vector quantization (SVQ) method, SeSVQ exploits the inter sub-vector correlation and thus provides improved rate-distortion performance, but at the expense of higher memory. We investigate the quantization performance of SeSVQ over traditional SVQ and transform domain split VQ (TrSVQ) methods. Compared to SVQ, SeSVQ saves 1 bit and nearly 3 bits, for telephone-band and wide-band speech coding applications respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper considers the design and analysis of a filter at the receiver of a source coding system to mitigate the excess Mean-Squared Error (MSE) distortion caused due to channel errors. It is assumed that the source encoder is channel-agnostic, i.e., that a Vector Quantization (VQ) based compression designed for a noiseless channel is employed. The index output by the source encoder is sent over a noisy memoryless discrete symmetric channel, and the possibly incorrect received index is decoded by the corresponding VQ decoder. The output of the VQ decoder is processed by a receive filter to obtain an estimate of the source instantiation. In the sequel, the optimum linear receive filter structure to minimize the overall MSE is derived, and shown to have a minimum-mean squared error receiver type structure. Further, expressions are derived for the resulting high-rate MSE performance. The performance is compared with the MSE obtained using conventional VQ as well as the channel optimized VQ. The accuracy of the expressions is demonstrated through Monte Carlo simulations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper considers the design and analysis of a filter at the receiver of a source coding system to mitigate the excess distortion caused due to channel errors. The index output by the source encoder is sent over a fading discrete binary symmetric channel and the possibly incorrect received index is mapped to the corresponding codeword by a Vector Quantization (VQ) decoder at the receiver. The output of the VQ decoder is then processed by a receive filter to obtain an estimate of the source instantiation. The distortion performance is analyzed for weighted mean square error (WMSE) and the optimum receive filter that minimizes the expected distortion is derived for two different cases of fading. It is shown that the performance of the system with the receive filter is strictly better than that of a conventional VQ and the difference becomes more significant as the number of bits transmitted increases. Theoretical expressions for an upper and lower bound on the WMSE performance of the system with the receive filter and a Rayleigh flat fading channel are derived. The design of a receive filter in the presence of channel mismatch is also studied and it is shown that a minimax solution is the one obtained by designing the receive filter for the worst possible channel. Simulation results are presented to validate the theoretical expressions and illustrate the benefits of receive filtering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Communication applications are usually delay restricted, especially for the instance of musicians playing over the Internet. This requires a one-way delay of maximum 25 msec and also a high audio quality is desired at feasible bit rates. The ultra low delay (ULD) audio coding structure is well suited to this application and we investigate further the application of multistage vector quantization (MSVQ) to reach a bit rate range below 64 Kb/s, in a scalable manner. Results at 32 Kb/s and 64 Kb/s show that the trained codebook MSVQ performs best, better than KLT normalization followed by a simulated Gaussian MSVQ or simulated Gaussian MSVQ alone. The results also show that there is only a weak dependence on the training data, and that we indeed converge to the perceptual quality of our previous ULD coder at 64 Kb/s.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High-rate analysis of channel-optimized vector quantizationThis paper considers the high-rate performance of channel optimized source coding for noisy discrete symmetric channels with random index assignment. Specifically, with mean squared error (MSE) as the performance metric, an upper bound on the asymptotic (i.e., high-rate) distortion is derived by assuming a general structure on the codebook. This structure enables extension of the analysis of the channel optimized source quantizer to one with a singular point density: for channels with small errors, the point density that minimizes the upper bound is continuous, while as the error rate increases, the point density becomes singular. The extent of the singularity is also characterized. The accuracy of the expressions obtained are verified through Monte Carlo simulations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper considers the high-rate performance of source coding for noisy discrete symmetric channels with random index assignment (IA). Accurate analytical models are developed to characterize the expected distortion performance of vector quantization (VQ) for a large class of distortion measures. It is shown that when the point density is continuous, the distortion can be approximated as the sum of the source quantization distortion and the channel-error induced distortion. Expressions are also derived for the continuous point density that minimizes the expected distortion. Next, for the case of mean squared error distortion, a more accurate analytical model for the distortion is derived by allowing the point density to have a singular component. The extent of the singularity is also characterized. These results provide analytical models for the expected distortion performance of both conventional VQ as well as for channel-optimized VQ. As a practical example, compression of the linear predictive coding parameters in the wideband speech spectrum is considered, with the log spectral distortion as performance metric. The theory is able to correctly predict the channel error rate that is permissible for operation at a particular level of distortion.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We develop a two stage split vector quantization method with optimum bit allocation, for achieving minimum computational complexity. This also results in much lower memory requirement than the recently proposed switched split vector quantization method. To improve the rate-distortion performance further, a region specific normalization is introduced, which results in 1 bit/vector improvement over the typical two stage split vector quantizer, for wide-band LSF quantization.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Further improvement in performance, to achieve near transparent quality LSF quantization, is shown to be possible by using a higher order two dimensional (2-D) prediction in the coefficient domain. The prediction is performed in a closed-loop manner so that the LSF reconstruction error is the same as the quantization error of the prediction residual. We show that an optimum 2-D predictor, exploiting both inter-frame and intra-frame correlations, performs better than existing predictive methods. Computationally efficient split vector quantization technique is used to implement the proposed 2-D prediction based method. We show further improvement in performance by using weighted Euclidean distance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Using analysis-by-synthesis (AbS) approach, we develop a soft decision based switched vector quantization (VQ) method for high quality and low complexity coding of wideband speech line spectral frequency (LSF) parameters. For each switching region, a low complexity transform domain split VQ (TrSVQ) is designed. The overall rate-distortion (R/D) performance optimality of new switched quantizer is addressed in the Gaussian mixture model (GMM) based parametric framework. In the AbS approach, the reduction of quantization complexity is achieved through the use of nearest neighbor (NN) TrSVQs and splitting the transform domain vector into higher number of subvectors. Compared to the current LSF quantization methods, the new method is shown to provide competitive or better trade-off between R/D performance and complexity.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We develop a Gaussian mixture model (GMM) based vector quantization (VQ) method for coding wideband speech line spectrum frequency (LSF) parameters at low complexity. The PDF of LSF source vector is modeled using the Gaussian mixture (GM) density with higher number of uncorrelated Gaussian mixtures and an optimum scalar quantizer (SQ) is designed for each Gaussian mixture. The reduction of quantization complexity is achieved using the relevant subset of available optimum SQs. For an input vector, the subset of quantizers is chosen using nearest neighbor criteria. The developed method is compared with the recent VQ methods and shown to provide high quality rate-distortion (R/D) performance at lower complexity. In addition, the developed method also provides the advantages of bitrate scalability and rate-independent complexity.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We revisit the problem of temporal self organization using activity diffusion based on the neural gas (NGAS) algorithm. Using a potential function formulation motivated by a spatio-temporal metric, we derive an adaptation rule for dynamic vector quantization of data. Simulations results show that our algorithm learns the input distribution and time correlation much faster compared to the static neural gas method over the same data sequence under similar training conditions.